Overview

Dataset statistics

Number of variables: 25
Number of observations: 29965
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.7 MiB
Average record size in memory: 200.0 B
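
The two memory figures above agree with each other, which a short back-of-the-envelope check can show. The sketch below assumes every column is stored as an 8-byte value; the report does not list dtypes, so `BYTES_PER_VALUE` is an assumption, not something the report states.

```python
# Figures taken from the dataset statistics above.
N_VARIABLES = 25
N_OBSERVATIONS = 29965

# Assumption: each cell is a 64-bit integer or float (8 bytes).
BYTES_PER_VALUE = 8

record_bytes = N_VARIABLES * BYTES_PER_VALUE          # bytes per row
total_mib = record_bytes * N_OBSERVATIONS / 2**20     # bytes -> MiB

print(f"{record_bytes} B per record, {total_mib:.1f} MiB total")
```

Under that assumption the result matches the reported 200.0 B per record and 5.7 MiB in total.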

Variable types

Numeric: 22
Categorical: 3

Alerts

PAY_0 is highly correlated with PAY_2 and 2 other fields High correlation
PAY_2 is highly correlated with PAY_0 and 7 other fields High correlation
PAY_3 is highly correlated with PAY_0 and 9 other fields High correlation
PAY_4 is highly correlated with PAY_0 and 10 other fields High correlation
PAY_5 is highly correlated with PAY_2 and 8 other fields High correlation
PAY_6 is highly correlated with PAY_2 and 8 other fields High correlation
BILL_AMT1 is highly correlated with PAY_2 and 8 other fields High correlation
BILL_AMT2 is highly correlated with PAY_2 and 10 other fields High correlation
BILL_AMT3 is highly correlated with PAY_2 and 11 other fields High correlation
BILL_AMT4 is highly correlated with PAY_3 and 13 other fields High correlation
BILL_AMT5 is highly correlated with PAY_3 and 13 other fields High correlation
BILL_AMT6 is highly correlated with PAY_4 and 11 other fields High correlation
PAY_AMT1 is highly correlated with BILL_AMT1 and 5 other fields High correlation
PAY_AMT2 is highly correlated with BILL_AMT3 and 5 other fields High correlation
PAY_AMT3 is highly correlated with BILL_AMT4 and 7 other fields High correlation
PAY_AMT4 is highly correlated with BILL_AMT4 and 6 other fields High correlation
PAY_AMT5 is highly correlated with BILL_AMT4 and 5 other fields High correlation
PAY_AMT6 is highly correlated with BILL_AMT5 and 4 other fields High correlation
PAY_0 is highly correlated with PAY_2 and 3 other fields High correlation
PAY_2 is highly correlated with PAY_0 and 4 other fields High correlation
PAY_3 is highly correlated with PAY_0 and 4 other fields High correlation
PAY_4 is highly correlated with PAY_0 and 4 other fields High correlation
PAY_5 is highly correlated with PAY_0 and 4 other fields High correlation
PAY_6 is highly correlated with PAY_2 and 3 other fields High correlation
BILL_AMT1 is highly correlated with BILL_AMT2 and 4 other fields High correlation
BILL_AMT2 is highly correlated with BILL_AMT1 and 4 other fields High correlation
BILL_AMT3 is highly correlated with BILL_AMT1 and 4 other fields High correlation
BILL_AMT4 is highly correlated with BILL_AMT1 and 4 other fields High correlation
BILL_AMT5 is highly correlated with BILL_AMT1 and 4 other fields High correlation
BILL_AMT6 is highly correlated with BILL_AMT1 and 4 other fields High correlation
PAY_0 is highly correlated with PAY_2 and 1 other field High correlation
PAY_2 is highly correlated with PAY_0 and 4 other fields High correlation
PAY_3 is highly correlated with PAY_0 and 4 other fields High correlation
PAY_4 is highly correlated with PAY_2 and 3 other fields High correlation
PAY_5 is highly correlated with PAY_2 and 4 other fields High correlation
PAY_6 is highly correlated with PAY_2 and 5 other fields High correlation
BILL_AMT1 is highly correlated with BILL_AMT2 and 4 other fields High correlation
BILL_AMT2 is highly correlated with BILL_AMT1 and 5 other fields High correlation
BILL_AMT3 is highly correlated with BILL_AMT1 and 5 other fields High correlation
BILL_AMT4 is highly correlated with PAY_5 and 5 other fields High correlation
BILL_AMT5 is highly correlated with PAY_6 and 6 other fields High correlation
BILL_AMT6 is highly correlated with PAY_6 and 6 other fields High correlation
PAY_AMT1 is highly correlated with BILL_AMT2 High correlation
PAY_AMT2 is highly correlated with BILL_AMT3 High correlation
PAY_AMT4 is highly correlated with BILL_AMT5 High correlation
PAY_AMT5 is highly correlated with BILL_AMT6 High correlation
LIMIT_BAL is highly correlated with BILL_AMT1 and 5 other fields High correlation
PAY_0 is highly correlated with PAY_2 and 5 other fields High correlation
PAY_2 is highly correlated with PAY_0 and 4 other fields High correlation
PAY_3 is highly correlated with PAY_0 and 4 other fields High correlation
PAY_4 is highly correlated with PAY_0 and 4 other fields High correlation
PAY_5 is highly correlated with PAY_0 and 5 other fields High correlation
PAY_6 is highly correlated with PAY_0 and 5 other fields High correlation
BILL_AMT1 is highly correlated with LIMIT_BAL and 6 other fields High correlation
BILL_AMT2 is highly correlated with LIMIT_BAL and 6 other fields High correlation
BILL_AMT3 is highly correlated with BILL_AMT1 and 6 other fields High correlation
BILL_AMT4 is highly correlated with LIMIT_BAL and 6 other fields High correlation
BILL_AMT5 is highly correlated with LIMIT_BAL and 8 other fields High correlation
BILL_AMT6 is highly correlated with LIMIT_BAL and 6 other fields High correlation
PAY_AMT1 is highly correlated with PAY_AMT2 and 2 other fields High correlation
PAY_AMT2 is highly correlated with BILL_AMT3 and 3 other fields High correlation
PAY_AMT3 is highly correlated with LIMIT_BAL and 8 other fields High correlation
PAY_AMT4 is highly correlated with PAY_AMT1 and 1 other field High correlation
PAY_AMT5 is highly correlated with BILL_AMT3 and 1 other field High correlation
default.payment.next.month is highly correlated with PAY_0 High correlation
PAY_AMT2 is highly skewed (γ1 = 30.43861292) Skewed
df_index is uniformly distributed Uniform
df_index has unique values Unique
PAY_0 has 14737 (49.2%) zeros Zeros
PAY_2 has 15730 (52.5%) zeros Zeros
PAY_3 has 15764 (52.6%) zeros Zeros
PAY_4 has 16455 (54.9%) zeros Zeros
PAY_5 has 16947 (56.6%) zeros Zeros
PAY_6 has 16286 (54.4%) zeros Zeros
BILL_AMT1 has 1978 (6.6%) zeros Zeros
BILL_AMT2 has 2476 (8.3%) zeros Zeros
BILL_AMT3 has 2840 (9.5%) zeros Zeros
BILL_AMT4 has 3165 (10.6%) zeros Zeros
BILL_AMT5 has 3476 (11.6%) zeros Zeros
BILL_AMT6 has 3990 (13.3%) zeros Zeros
PAY_AMT1 has 5218 (17.4%) zeros Zeros
PAY_AMT2 has 5365 (17.9%) zeros Zeros
PAY_AMT3 has 5937 (19.8%) zeros Zeros
PAY_AMT4 has 6377 (21.3%) zeros Zeros
PAY_AMT5 has 6672 (22.3%) zeros Zeros
PAY_AMT6 has 7142 (23.8%) zeros Zeros
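
The percentages in the Zeros alerts are the zero counts divided by the 29965 observations, rounded to one decimal place. The following sketch recomputes three of them from the counts listed above; the helper name `zero_share` is only illustrative and is not part of the report or of pandas-profiling.

```python
N_OBSERVATIONS = 29965

# Zero counts copied from the alerts above (a subset, for illustration).
zero_counts = {"PAY_0": 14737, "BILL_AMT1": 1978, "PAY_AMT6": 7142}

def zero_share(count, total=N_OBSERVATIONS):
    """Return the percentage of zeros, rounded to one decimal place."""
    return round(100 * count / total, 1)

for name, count in zero_counts.items():
    print(f"{name} has {count} ({zero_share(count)}%) zeros")
```

The recomputed shares (49.2%, 6.6%, 23.8%) agree with the alert text.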

Reproduction

Analysis started: 2022-04-05 18:25:57.234104
Analysis finished: 2022-04-05 18:27:51.360836
Duration: 1 minute and 54.13 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
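
The reported duration follows directly from the two timestamps. A minimal check with the standard library, assuming both timestamps are in the same time zone (the report does not say):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2022-04-05 18:25:57.234104", FMT)
finished = datetime.strptime("2022-04-05 18:27:51.360836", FMT)

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute and {seconds:.2f} seconds")
```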

Variables

df_index
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 29965
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14993.93252
Minimum: 0
Maximum: 29999
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 234.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1498.2
Q1: 7496
median: 14991
Q3: 22493
95-th percentile: 28495.8
Maximum: 29999
Range: 29999
Interquartile range (IQR): 14997

Descriptive statistics

Standard deviation: 8659.328323
Coefficient of variation (CV): 0.5775221618
Kurtosis: -1.199841197
Mean: 14993.93252
Median Absolute Deviation (MAD): 7499
Skewness: 0.0005462130641
Sum: 449293188
Variance: 74983967.01
Monotonicity: Strictly increasing

Histogram with fixed size bins (bins=50)

Common Values

Value                 Count  Frequency (%)
0                     1      < 0.1%
19990                 1      < 0.1%
20002                 1      < 0.1%
20001                 1      < 0.1%
20000                 1      < 0.1%
19999                 1      < 0.1%
19998                 1      < 0.1%
19997                 1      < 0.1%
19996                 1      < 0.1%
19995                 1      < 0.1%
Other values (29955)  29955  > 99.9%

Minimum Values

Value  Count  Frequency (%)
0      1      < 0.1%
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%

Maximum Values

Value  Count  Frequency (%)
29999  1      < 0.1%
29998  1      < 0.1%
29997  1      < 0.1%
29996  1      < 0.1%
29995  1      < 0.1%
29994  1      < 0.1%
29993  1      < 0.1%
29992  1      < 0.1%
29991  1      < 0.1%
29990  1      < 0.1%
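
The df_index figures imply that the index is not a complete 0..29999 range: it spans 30000 possible values but has only 29965 distinct ones. The sketch below derives how many index values are absent and what they sum to, using only the reported minimum, maximum, count and sum. Why those rows are absent (for example, filtering before profiling) is not stated in the report.

```python
# Figures taken from the df_index statistics above.
n_rows = 29965
index_min, index_max = 0, 29999
reported_sum = 449293188

# A gap-free integer range would contain this many values with this sum.
full_range_count = index_max - index_min + 1
full_range_sum = (index_min + index_max) * full_range_count // 2

missing_count = full_range_count - n_rows    # index values not present
missing_sum = full_range_sum - reported_sum  # sum of the absent index values
mean = reported_sum / n_rows                 # reproduces the reported mean

print(f"{missing_count} index values are absent; mean = {mean:.5f}")
```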

LIMIT_BAL
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 81
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 167442.005
Minimum: 10000
Maximum: 1000000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 234.2 KiB

Quantile statistics

Minimum: 10000
5-th percentile: 20000
Q1: 50000
median: 140000
Q3: 240000
95-th percentile: 430000
Maximum: 1000000
Range: 990000
Interquartile range (IQR): 190000

Descriptive statistics

Standard deviation: 129760.1352
Coefficient of variation (CV): 0.7749556942
Kurtosis: 0.5375871217
Mean: 167442.005
Median Absolute Deviation (MAD): 90000
Skewness: 0.9934913272
Sum: 5017399680
Variance: 1.683769269 × 10^10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value              Count  Frequency (%)
50000              3363   11.2%
20000              1975   6.6%
30000              1610   5.4%
80000              1564   5.2%
200000             1524   5.1%
150000             1107   3.7%
100000             1047   3.5%
180000             993    3.3%
360000             874    2.9%
60000              825    2.8%
Other values (71)  15083  50.3%

Minimum Values

Value  Count  Frequency (%)
10000  493    1.6%
16000  2      < 0.1%
20000  1975   6.6%
30000  1610   5.4%
40000  230    0.8%
50000  3363   11.2%
60000  825    2.8%
70000  731    2.4%
80000  1564   5.2%
90000  650    2.2%

Maximum Values

Value    Count  Frequency (%)
1000000  1      < 0.1%
800000   2      < 0.1%
780000   2      < 0.1%
760000   1      < 0.1%
750000   4      < 0.1%
740000   2      < 0.1%
730000   2      < 0.1%
720000   3      < 0.1%
710000   6      < 0.1%
700000   8      < 0.1%
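
Several of the LIMIT_BAL statistics are derived from others, so they can be cross-checked: the variance is the square of the standard deviation (on the order of 10^10 here), the IQR is Q3 minus Q1, and the range is maximum minus minimum. A small sketch using the reported values:

```python
# Figures taken from the LIMIT_BAL statistics above.
std_dev = 129760.1352
q1, q3 = 50000, 240000
minimum, maximum = 10000, 1000000

variance = std_dev ** 2        # should be about 1.6838e10
iqr = q3 - q1                  # interquartile range
value_range = maximum - minimum

print(f"variance = {variance:.4e}, IQR = {iqr}, range = {value_range}")
```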

SEX
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 234.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 1

Common Values

Value  Count  Frequency (%)
2      18091  60.4%
1      11874  39.6%


EDUCATION
Real number (ℝ≥0)

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.853629234
Minimum: 0
Maximum: 6
Zeros: 14
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 234.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
median: 2
Q3: 2
95-th percentile: 3
Maximum: 6
Range: 6
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7904114716
Coefficient of variation (CV): 0.4264129293
Kurtosis: 2.079207034
Mean: 1.853629234
Median Absolute Deviation (MAD): 1
Skewness: 0.9707092745
Sum: 55544
Variance: 0.6247502944
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Common Values

Value  Count  Frequency (%)
2      14019  46.8%
1      10563  35.3%
3      4915   16.4%
5      280    0.9%
4      123    0.4%
6      51     0.2%
0      14     < 0.1%

Minimum Values

Value  Count  Frequency (%)
0      14     < 0.1%
1      10563  35.3%
2      14019  46.8%
3      4915   16.4%
4      123    0.4%
5      280    0.9%
6      51     0.2%

Maximum Values

Value  Count  Frequency (%)
6      51     0.2%
5      280    0.9%
4      123    0.4%
3      4915   16.4%
2      14019  46.8%
1      10563  35.3%
0      14     < 0.1%

MARRIAGE
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 234.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
2      15945  53.2%
1      13643  45.5%
3      323    1.1%
0      54     0.2%


AGE
Real number (ℝ≥0)

Distinct: 56
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.4879693
Minimum: 21
Maximum: 79
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 234.2 KiB

Quantile statistics

Minimum: 21
5-th percentile: 23
Q1: 28
median: 34
Q3: 41
95-th percentile: 53
Maximum: 79
Range: 58
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 9.219459233
Coefficient of variation (CV): 0.2597911184
Kurtosis: 0.04398801494
Mean: 35.4879693
Median Absolute Deviation (MAD): 6
Skewness: 0.7320560019
Sum: 1063397
Variance: 84.99842855
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value              Count  Frequency (%)
29                 1602   5.3%
27                 1475   4.9%
28                 1406   4.7%
30                 1394   4.7%
26                 1252   4.2%
31                 1213   4.0%
25                 1185   4.0%
34                 1161   3.9%
32                 1157   3.9%
33                 1146   3.8%
Other values (46)  16974  56.6%

Minimum Values

Value  Count  Frequency (%)
21     67     0.2%
22     560    1.9%
23     930    3.1%
24     1126   3.8%
25     1185   4.0%
26     1252   4.2%
27     1475   4.9%
28     1406   4.7%
29     1602   5.3%
30     1394   4.7%

Maximum Values

Value  Count  Frequency (%)
79     1      < 0.1%
75     3      < 0.1%
74     1      < 0.1%
73     4      < 0.1%
72     3      < 0.1%
71     3      < 0.1%
70     10     < 0.1%
69     15     0.1%
68     5      < 0.1%
67     16     0.1%

PAY_0
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct11
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-0.01675287836
Minimum-2
Maximum8
Zeros14737
Zeros (%)49.2%
Negative8432
Negative (%)28.1%
Memory size234.2 KiB

Quantile statistics

Minimum-2
5-th percentile-2
Q1-1
median0
Q30
95-th percentile2
Maximum8
Range10
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.123492034
Coefficient of variation (CV)-67.06262707
Kurtosis2.730038381
Mean-0.01675287836
Median Absolute Deviation (MAD)1
Skewness0.7346064765
Sum-502
Variance1.26223435
MonotonicityNot monotonic
Histogram with fixed size bins (bins=11)
ValueCountFrequency (%)
014737
49.2%
-15682
 
19.0%
13667
 
12.2%
-22750
 
9.2%
22666
 
8.9%
3322
 
1.1%
476
 
0.3%
526
 
0.1%
819
 
0.1%
611
 
< 0.1%
ValueCountFrequency (%)
-22750
 
9.2%
-15682
 
19.0%
014737
49.2%
13667
 
12.2%
22666
 
8.9%
3322
 
1.1%
476
 
0.3%
526
 
0.1%
611
 
< 0.1%
79
 
< 0.1%
ValueCountFrequency (%)
819
 
0.1%
79
 
< 0.1%
611
 
< 0.1%
526
 
0.1%
476
 
0.3%
3322
 
1.1%
22666
 
8.9%
13667
 
12.2%
014737
49.2%
-15682
 
19.0%
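
The large negative coefficient of variation reported for PAY_0 (-67.06) is a side effect of how the statistic is defined: CV is the standard deviation divided by the mean, so it becomes large and sign-sensitive when the mean is close to zero. The sketch below recomputes it from the reported mean and standard deviation and compares it with AGE, whose mean is far from zero. The function name is illustrative only.

```python
def coefficient_of_variation(mean, std_dev):
    """CV = standard deviation / mean; unstable when the mean is near zero."""
    return std_dev / mean

# (mean, standard deviation) pairs copied from this report.
reported = {
    "PAY_0": (-0.01675287836, 1.123492034),
    "AGE": (35.4879693, 9.219459233),
}

for name, (mean, std_dev) in reported.items():
    print(f"{name}: CV = {coefficient_of_variation(mean, std_dev):.4f}")
```

Both columns have a spread of similar relative interest, but only AGE yields a CV that is straightforward to interpret; for the PAY_* columns the standard deviation itself is the more informative figure.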

PAY_2
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct11
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-0.1318538295
Minimum-2
Maximum8
Zeros15730
Zeros (%)52.5%
Negative9798
Negative (%)32.7%
Memory size234.2 KiB

Quantile statistics

Minimum-2
5-th percentile-2
Q1-1
median0
Q30
95-th percentile2
Maximum8
Range10
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.196321699
Coefficient of variation (CV)-9.073090281
Kurtosis1.577608705
Mean-0.1318538295
Median Absolute Deviation (MAD)0
Skewness0.7920704147
Sum-3951
Variance1.431185607
MonotonicityNot monotonic
Histogram with fixed size bins (bins=11)
ValueCountFrequency (%)
015730
52.5%
-16046
 
20.2%
23926
 
13.1%
-23752
 
12.5%
3326
 
1.1%
499
 
0.3%
128
 
0.1%
525
 
0.1%
720
 
0.1%
612
 
< 0.1%
ValueCountFrequency (%)
-23752
 
12.5%
-16046
 
20.2%
015730
52.5%
128
 
0.1%
23926
 
13.1%
3326
 
1.1%
499
 
0.3%
525
 
0.1%
612
 
< 0.1%
720
 
0.1%
ValueCountFrequency (%)
81
 
< 0.1%
720
 
0.1%
612
 
< 0.1%
525
 
0.1%
499
 
0.3%
3326
 
1.1%
23926
 
13.1%
128
 
0.1%
015730
52.5%
-16046
 
20.2%

PAY_3
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct11
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-0.1643917904
Minimum-2
Maximum8
Zeros15764
Zeros (%)52.6%
Negative9989
Negative (%)33.3%
Memory size234.2 KiB

Quantile statistics

Minimum-2
5-th percentile-2
Q1-1
median0
Q30
95-th percentile2
Maximum8
Range10
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.195877509
Coefficient of variation (CV)-7.274557358
Kurtosis2.091665951
Mean-0.1643917904
Median Absolute Deviation (MAD)0
Skewness0.8414639808
Sum-4926
Variance1.430123016
MonotonicityNot monotonic
Histogram with fixed size bins (bins=11)
ValueCountFrequency (%)
015764
52.6%
-15934
 
19.8%
-24055
 
13.5%
23819
 
12.7%
3240
 
0.8%
475
 
0.3%
727
 
0.1%
623
 
0.1%
521
 
0.1%
14
 
< 0.1%
ValueCountFrequency (%)
-24055
 
13.5%
-15934
 
19.8%
015764
52.6%
14
 
< 0.1%
23819
 
12.7%
3240
 
0.8%
475
 
0.3%
521
 
0.1%
623
 
0.1%
727
 
0.1%
ValueCountFrequency (%)
83
 
< 0.1%
727
 
0.1%
623
 
0.1%
521
 
0.1%
475
 
0.3%
3240
 
0.8%
23819
 
12.7%
14
 
< 0.1%
015764
52.6%
-15934
 
19.8%

PAY_4
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct11
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-0.2189220758
Minimum-2
Maximum8
Zeros16455
Zeros (%)54.9%
Negative10001
Negative (%)33.4%
Memory size234.2 KiB

Quantile statistics

Minimum-2
5-th percentile-2
Q1-1
median0
Q30
95-th percentile2
Maximum8
Range10
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.168175186
Coefficient of variation (CV)-5.33603193
Kurtosis3.508962108
Mean-0.2189220758
Median Absolute Deviation (MAD)0
Skewness1.000798562
Sum-6560
Variance1.364633266
MonotonicityNot monotonic
Histogram with fixed size bins (bins=11)
ValueCountFrequency (%)
016455
54.9%
-15683
 
19.0%
-24318
 
14.4%
23159
 
10.5%
3180
 
0.6%
468
 
0.2%
758
 
0.2%
535
 
0.1%
65
 
< 0.1%
12
 
< 0.1%
ValueCountFrequency (%)
-24318
 
14.4%
-15683
 
19.0%
016455
54.9%
12
 
< 0.1%
23159
 
10.5%
3180
 
0.6%
468
 
0.2%
535
 
0.1%
65
 
< 0.1%
758
 
0.2%
ValueCountFrequency (%)
82
 
< 0.1%
758
 
0.2%
65
 
< 0.1%
535
 
0.1%
468
 
0.2%
3180
 
0.6%
23159
 
10.5%
12
 
< 0.1%
016455
54.9%
-15683
 
19.0%

PAY_5
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-0.2645085934
Minimum-2
Maximum8
Zeros16947
Zeros (%)56.6%
Negative10051
Negative (%)33.5%
Memory size234.2 KiB

Quantile statistics

Minimum-2
5-th percentile-2
Q1-1
median0
Q30
95-th percentile2
Maximum8
Range10
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.132219856
Coefficient of variation (CV)-4.280465302
Kurtosis4.003562263
Mean-0.2645085934
Median Absolute Deviation (MAD)0
Skewness1.009329021
Sum-7926
Variance1.281921802
MonotonicityNot monotonic
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%)
016947
56.6%
-15535
 
18.5%
-24516
 
15.1%
22626
 
8.8%
3178
 
0.6%
483
 
0.3%
758
 
0.2%
517
 
0.1%
64
 
< 0.1%
81
 
< 0.1%
ValueCountFrequency (%)
-24516
 
15.1%
-15535
 
18.5%
016947
56.6%
22626
 
8.8%
3178
 
0.6%
483
 
0.3%
517
 
0.1%
64
 
< 0.1%
758
 
0.2%
81
 
< 0.1%
ValueCountFrequency (%)
81
 
< 0.1%
758
 
0.2%
64
 
< 0.1%
517
 
0.1%
483
 
0.3%
3178
 
0.6%
22626
 
8.8%
016947
56.6%
-15535
 
18.5%
-24516
 
15.1%

PAY_6
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-0.2894376773
Minimum-2
Maximum8
Zeros16286
Zeros (%)54.4%
Negative10601
Negative (%)35.4%
Memory size234.2 KiB

Quantile statistics

Minimum-2
5-th percentile-2
Q1-1
median0
Q30
95-th percentile2
Maximum8
Range10
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.1490901
Coefficient of variation (CV)-3.970077809
Kurtosis3.437256875
Mean-0.2894376773
Median Absolute Deviation (MAD)0
Skewness0.9486089933
Sum-8673
Variance1.320408057
MonotonicityNot monotonic
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%)
016286
54.4%
-15736
 
19.1%
-24865
 
16.2%
22766
 
9.2%
3184
 
0.6%
448
 
0.2%
746
 
0.2%
619
 
0.1%
513
 
< 0.1%
82
 
< 0.1%
ValueCountFrequency (%)
-24865
 
16.2%
-15736
 
19.1%
016286
54.4%
22766
 
9.2%
3184
 
0.6%
448
 
0.2%
513
 
< 0.1%
619
 
0.1%
746
 
0.2%
82
 
< 0.1%
ValueCountFrequency (%)
82
 
< 0.1%
746
 
0.2%
619
 
0.1%
513
 
< 0.1%
448
 
0.2%
3184
 
0.6%
22766
 
9.2%
016286
54.4%
-15736
 
19.1%
-24865
 
16.2%

BILL_AMT1
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct22723
Distinct (%)75.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean51283.00978
Minimum-165580
Maximum964511
Zeros1978
Zeros (%)6.6%
Negative590
Negative (%)2.0%
Memory size234.2 KiB

Quantile statistics

Minimum-165580
5-th percentile0
Q13595
median22438
Q367260
95-th percentile201303.8
Maximum964511
Range1130091
Interquartile range (IQR)63665

Descriptive statistics

Standard deviation73658.1324
Coefficient of variation (CV)1.436306736
Kurtosis9.796846218
Mean51283.00978
Median Absolute Deviation (MAD)21842
Skewness2.662513456
Sum1536695388
Variance5425520469
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
01978
 
6.6%
390243
 
0.8%
78076
 
0.3%
32672
 
0.2%
31663
 
0.2%
250059
 
0.2%
39648
 
0.2%
240039
 
0.1%
41629
 
0.1%
50025
 
0.1%
Other values (22713)27333
91.2%
ValueCountFrequency (%)
-1655801
< 0.1%
-1549731
< 0.1%
-153081
< 0.1%
-143861
< 0.1%
-115451
< 0.1%
-106821
< 0.1%
-98021
< 0.1%
-90951
< 0.1%
-81871
< 0.1%
-74381
< 0.1%
ValueCountFrequency (%)
9645111
< 0.1%
7468141
< 0.1%
6530621
< 0.1%
6304581
< 0.1%
6266481
< 0.1%
6217491
< 0.1%
6138601
< 0.1%
6107231
< 0.1%
6085941
< 0.1%
6040191
< 0.1%

BILL_AMT2
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct22346
Distinct (%)74.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean49236.36629
Minimum-69777
Maximum983931
Zeros2476
Zeros (%)8.3%
Negative669
Negative (%)2.2%
Memory size234.2 KiB

Quantile statistics

Minimum-69777
5-th percentile0
Q13010
median21295
Q364109
95-th percentile194889.6
Maximum983931
Range1053708
Interquartile range (IQR)61099

Descriptive statistics

Standard deviation71195.56739
Coefficient of variation (CV)1.445995567
Kurtosis10.29321199
Mean49236.36629
Median Absolute Deviation (MAD)20905
Skewness2.70386174
Sum1475367716
Variance5068808816
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
02476
 
8.3%
390230
 
0.8%
32675
 
0.3%
78075
 
0.3%
31672
 
0.2%
250051
 
0.2%
39650
 
0.2%
240042
 
0.1%
-20029
 
0.1%
41628
 
0.1%
Other values (22336)26837
89.6%
ValueCountFrequency (%)
-697771
< 0.1%
-675261
< 0.1%
-333501
< 0.1%
-300001
< 0.1%
-262141
< 0.1%
-247041
< 0.1%
-247021
< 0.1%
-229601
< 0.1%
-186181
< 0.1%
-180881
< 0.1%
ValueCountFrequency (%)
9839311
< 0.1%
7439701
< 0.1%
6715631
< 0.1%
6467701
< 0.1%
6244751
< 0.1%
6059431
< 0.1%
5977931
< 0.1%
5868251
< 0.1%
5817751
< 0.1%
5776811
< 0.1%

BILL_AMT3
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct22026
Distinct (%)73.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean47067.91607
Minimum-157264
Maximum1664089
Zeros2840
Zeros (%)9.5%
Negative655
Negative (%)2.2%
Memory size234.2 KiB

Quantile statistics

Minimum-157264
5-th percentile0
Q12711
median20135
Q360201
95-th percentile187901
Maximum1664089
Range1821353
Interquartile range (IQR)57490

Descriptive statistics

Standard deviation69371.35232
Coefficient of variation (CV)1.473856464
Kurtosis19.77100256
Mean47067.91607
Median Absolute Deviation (MAD)19745
Skewness3.086493832
Sum1410390105
Variance4812384523
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
02840
 
9.5%
390274
 
0.9%
78074
 
0.2%
32663
 
0.2%
31662
 
0.2%
39647
 
0.2%
250040
 
0.1%
240039
 
0.1%
41629
 
0.1%
20027
 
0.1%
Other values (22016)26470
88.3%
ValueCountFrequency (%)
-1572641
< 0.1%
-615061
< 0.1%
-461271
< 0.1%
-340411
< 0.1%
-254431
< 0.1%
-247021
< 0.1%
-203201
< 0.1%
-177061
< 0.1%
-159101
< 0.1%
-156411
< 0.1%
ValueCountFrequency (%)
16640891
< 0.1%
8550861
< 0.1%
6931311
< 0.1%
6896431
< 0.1%
6896271
< 0.1%
6320411
< 0.1%
5974151
< 0.1%
5789711
< 0.1%
5779571
< 0.1%
5770151
< 0.1%

BILL_AMT4
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct21548
Distinct (%)71.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean43313.32988
Minimum-170000
Maximum891586
Zeros3165
Zeros (%)10.6%
Negative675
Negative (%)2.3%
Memory size234.2 KiB

Quantile statistics

Minimum-170000
5-th percentile0
Q12360
median19081
Q354601
95-th percentile174469.8
Maximum891586
Range1061586
Interquartile range (IQR)52241

Descriptive statistics

Standard deviation64353.51437
Coefficient of variation (CV)1.485766958
Kurtosis11.29858229
Mean43313.32988
Median Absolute Deviation (MAD)18681
Skewness2.820544832
Sum1297883930
Variance4141374812
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
03165
 
10.6%
390245
 
0.8%
780101
 
0.3%
31668
 
0.2%
32662
 
0.2%
39643
 
0.1%
240039
 
0.1%
15039
 
0.1%
250034
 
0.1%
41633
 
0.1%
Other values (21538)26136
87.2%
ValueCountFrequency (%)
-1700001
< 0.1%
-813341
< 0.1%
-651671
< 0.1%
-506161
< 0.1%
-466271
< 0.1%
-345031
< 0.1%
-274901
< 0.1%
-243031
< 0.1%
-221081
< 0.1%
-203201
< 0.1%
ValueCountFrequency (%)
8915861
< 0.1%
7068641
< 0.1%
6286991
< 0.1%
6168361
< 0.1%
5728051
< 0.1%
5690341
< 0.1%
5656691
< 0.1%
5635431
< 0.1%
5480201
< 0.1%
5426531
< 0.1%

BILL_AMT5
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct21010
Distinct (%)70.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean40358.33439
Minimum-81334
Maximum927171
Zeros3476
Zeros (%)11.6%
Negative655
Negative (%)2.2%
Memory size234.2 KiB

Quantile statistics

Minimum-81334
5-th percentile0
Q11787
median18130
Q350247
95-th percentile165805.6
Maximum927171
Range1008505
Interquartile range (IQR)48460

Descriptive statistics

Standard deviation60817.13062
Coefficient of variation (CV)1.506928657
Kurtosis12.29453891
Mean40358.33439
Median Absolute Deviation (MAD)17714
Skewness2.874925049
Sum1209337490
Variance3698723377
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
03476
 
11.6%
390234
 
0.8%
78094
 
0.3%
31679
 
0.3%
32662
 
0.2%
15058
 
0.2%
39646
 
0.2%
240039
 
0.1%
250037
 
0.1%
41636
 
0.1%
Other values (21000)25804
86.1%
ValueCountFrequency (%)
-813341
< 0.1%
-613721
< 0.1%
-530071
< 0.1%
-466271
< 0.1%
-375941
< 0.1%
-361561
< 0.1%
-304811
< 0.1%
-283351
< 0.1%
-230031
< 0.1%
-207531
< 0.1%
ValueCountFrequency (%)
9271711
< 0.1%
8235401
< 0.1%
5870671
< 0.1%
5517021
< 0.1%
5478801
< 0.1%
5306721
< 0.1%
5243151
< 0.1%
5161391
< 0.1%
5141141
< 0.1%
5082131
< 0.1%

BILL_AMT6
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct20604
Distinct (%)68.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean38917.01228
Minimum-339603
Maximum961664
Zeros3990
Zeros (%)13.3%
Negative688
Negative (%)2.3%
Memory size234.2 KiB

Quantile statistics

Minimum-339603
5-th percentile0
Q11262
median17124
Q349252
95-th percentile161932
Maximum961664
Range1301267
Interquartile range (IQR)47990

Descriptive statistics

Standard deviation59574.14774
Coefficient of variation (CV)1.530799623
Kurtosis12.25912611
Mean38917.01228
Median Absolute Deviation (MAD)16808
Skewness2.845137169
Sum1166148273
Variance3549079079
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
03990
 
13.3%
390206
 
0.7%
78086
 
0.3%
15078
 
0.3%
31677
 
0.3%
32656
 
0.2%
39644
 
0.1%
41636
 
0.1%
-1833
 
0.1%
240032
 
0.1%
Other values (20594)25327
84.5%
ValueCountFrequency (%)
-3396031
< 0.1%
-2090511
< 0.1%
-1509531
< 0.1%
-946251
< 0.1%
-738951
< 0.1%
-570601
< 0.1%
-514431
< 0.1%
-511831
< 0.1%
-466271
< 0.1%
-457341
< 0.1%
ValueCountFrequency (%)
9616641
< 0.1%
6999441
< 0.1%
5686381
< 0.1%
5277111
< 0.1%
5275661
< 0.1%
5149751
< 0.1%
5137981
< 0.1%
5119051
< 0.1%
5013701
< 0.1%
4991001
< 0.1%

PAY_AMT1
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct7943
Distinct (%)26.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5670.099316
Minimum0
Maximum873552
Zeros5218
Zeros (%)17.4%
Negative0
Negative (%)0.0%
Memory size234.2 KiB

Quantile statistics

Minimum0
5-th percentile0
Q11000
median2102
Q35008
95-th percentile18447.2
Maximum873552
Range873552
Interquartile range (IQR)4008

Descriptive statistics

Standard deviation16571.84947
Coefficient of variation (CV)2.92267358
Kurtosis414.8548633
Mean5670.099316
Median Absolute Deviation (MAD)1929
Skewness14.66159454
Sum169904526
Variance274626194.7
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
05218
 
17.4%
20001363
 
4.5%
3000891
 
3.0%
5000698
 
2.3%
1500507
 
1.7%
4000426
 
1.4%
10000401
 
1.3%
1000365
 
1.2%
2500298
 
1.0%
6000294
 
1.0%
Other values (7933)19504
65.1%
ValueCountFrequency (%)
05218
17.4%
19
 
< 0.1%
214
 
< 0.1%
315
 
0.1%
418
 
0.1%
512
 
< 0.1%
615
 
0.1%
79
 
< 0.1%
88
 
< 0.1%
97
 
< 0.1%
ValueCountFrequency (%)
8735521
< 0.1%
5050001
< 0.1%
4933581
< 0.1%
4239031
< 0.1%
4050161
< 0.1%
3681991
< 0.1%
3230141
< 0.1%
3048151
< 0.1%
3020001
< 0.1%
3000391
< 0.1%

PAY_AMT2
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct7899
Distinct (%)26.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5927.98318
Minimum0
Maximum1684259
Zeros5365
Zeros (%)17.9%
Negative0
Negative (%)0.0%
Memory size234.2 KiB

Quantile statistics

Minimum0
5-th percentile0
Q1850
median2010
Q35000
95-th percentile19030.8
Maximum1684259
Range1684259
Interquartile range (IQR)4150

Descriptive statistics

Standard deviation23053.45664
Coefficient of variation (CV)3.888920724
Kurtosis1639.924451
Mean5927.98318
Median Absolute Deviation (MAD)1990
Skewness30.43861292
Sum177632016
Variance531461863.3
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
05365
 
17.9%
20001290
 
4.3%
3000857
 
2.9%
5000717
 
2.4%
1000594
 
2.0%
1500521
 
1.7%
4000410
 
1.4%
10000318
 
1.1%
6000283
 
0.9%
2500251
 
0.8%
Other values (7889)19359
64.6%
ValueCountFrequency (%)
05365
17.9%
115
 
0.1%
220
 
0.1%
318
 
0.1%
411
 
< 0.1%
525
 
0.1%
68
 
< 0.1%
712
 
< 0.1%
89
 
< 0.1%
96
 
< 0.1%
ValueCountFrequency (%)
16842591
< 0.1%
12270821
< 0.1%
12154711
< 0.1%
10245161
< 0.1%
5804641
< 0.1%
4155521
< 0.1%
4010031
< 0.1%
3881261
< 0.1%
3852281
< 0.1%
3849861
< 0.1%
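
PAY_AMT2 is the only column flagged as Skewed, with γ1 = 30.43861292. Its maximum values (above one million) sit far from a median of 2010, and a few such extreme values are enough to drive skewness up sharply. The sketch below implements the simple moment-based skewness g1 = m3 / m2^1.5 in pure Python. This is an illustration under assumptions: the report does not state which estimator it uses (pandas' `Series.skew`, for instance, applies a bias correction, so its values differ slightly), and the `right_tailed` sample is made up, not taken from the dataset.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
# Hypothetical payment amounts with one extreme value (not real data).
right_tailed = [0, 0, 1000, 2000, 5000, 1_000_000]

print(f"symmetric:    {skewness(symmetric):.3f}")
print(f"right-tailed: {skewness(right_tailed):.3f}")
```

A symmetric sample gives a skewness of zero, while a single very large payment produces a clearly positive value, which is the pattern behind the alert.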

PAY_AMT3
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct7518
Distinct (%)25.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5231.688837
Minimum0
Maximum896040
Zeros5937
Zeros (%)19.8%
Negative0
Negative (%)0.0%
Memory size234.2 KiB

Quantile statistics

Minimum0
5-th percentile0
Q1390
median1804
Q34512
95-th percentile17602.6
Maximum896040
Range896040
Interquartile range (IQR)4122

Descriptive statistics

Standard deviation17616.36112
Coefficient of variation (CV)3.367241759
Kurtosis563.7392771
Mean5231.688837
Median Absolute Deviation (MAD)1796
Skewness17.2081766
Sum156767556
Variance310336179.3
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
05937
 
19.8%
20001285
 
4.3%
10001103
 
3.7%
3000870
 
2.9%
5000721
 
2.4%
1500490
 
1.6%
4000381
 
1.3%
10000312
 
1.0%
1200243
 
0.8%
6000241
 
0.8%
Other values (7508)18382
61.3%
ValueCountFrequency (%)
05937
19.8%
113
 
< 0.1%
219
 
0.1%
314
 
< 0.1%
415
 
0.1%
518
 
0.1%
614
 
< 0.1%
718
 
0.1%
810
 
< 0.1%
912
 
< 0.1%
ValueCountFrequency (%)
8960401
< 0.1%
8890431
< 0.1%
5082291
< 0.1%
4175881
< 0.1%
4009721
< 0.1%
3970921
< 0.1%
3804781
< 0.1%
3717181
< 0.1%
3493951
< 0.1%
3442611
< 0.1%

PAY_AMT4
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 6937
Distinct (%): 23.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4831.617454
Minimum: 0
Maximum: 621000
Zeros: 6377
Zeros (%): 21.3%
Negative: 0
Negative (%): 0.0%
Memory size: 234.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 300
Median: 1500
Q3: 4016
95-th percentile: 16037
Maximum: 621000
Range: 621000
Interquartile range (IQR): 3716

Descriptive statistics

Standard deviation: 15674.46454
Coefficient of variation (CV): 3.244144365
Kurtosis: 277.0486932
Mean: 4831.617454
Median Absolute Deviation (MAD): 1500
Skewness: 12.89850649
Sum: 144779417
Variance: 245688838.5
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
0                     6377    21.3%
1000                  1394    4.7%
2000                  1214    4.1%
3000                  887     3.0%
5000                  810     2.7%
1500                  441     1.5%
4000                  402     1.3%
10000                 341     1.1%
2500                  259     0.9%
500                   258     0.9%
Other values (6927)   17582   58.7%

Smallest 10 values

Value                 Count   Frequency (%)
0                     6377    21.3%
1                     22      0.1%
2                     22      0.1%
3                     13      < 0.1%
4                     20      0.1%
5                     12      < 0.1%
6                     16      0.1%
7                     11      < 0.1%
8                     7       < 0.1%
9                     9       < 0.1%

Largest 10 values

Value                 Count   Frequency (%)
621000                1       < 0.1%
528897                1       < 0.1%
497000                1       < 0.1%
432130                1       < 0.1%
400046                1       < 0.1%
331788                1       < 0.1%
330982                1       < 0.1%
320008                1       < 0.1%
313094                1       < 0.1%
292962                1       < 0.1%

PAY_AMT5
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 6897
Distinct (%): 23.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4804.897047
Minimum: 0
Maximum: 426529
Zeros: 6672
Zeros (%): 22.3%
Negative: 0
Negative (%): 0.0%
Memory size: 234.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 261
Median: 1500
Q3: 4042
95-th percentile: 16000
Maximum: 426529
Range: 426529
Interquartile range (IQR): 3781

Descriptive statistics

Standard deviation: 15286.3723
Coefficient of variation (CV): 3.181415158
Kurtosis: 179.8752095
Mean: 4804.897047
Median Absolute Deviation (MAD): 1500
Skewness: 11.12174174
Sum: 143978740
Variance: 233673178
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
0                     6672    22.3%
1000                  1340    4.5%
2000                  1323    4.4%
3000                  947     3.2%
5000                  814     2.7%
1500                  426     1.4%
4000                  401     1.3%
10000                 343     1.1%
500                   250     0.8%
6000                  247     0.8%
Other values (6887)   17202   57.4%

Smallest 10 values

Value                 Count   Frequency (%)
0                     6672    22.3%
1                     21      0.1%
2                     13      < 0.1%
3                     13      < 0.1%
4                     12      < 0.1%
5                     9       < 0.1%
6                     7       < 0.1%
7                     9       < 0.1%
8                     6       < 0.1%
9                     6       < 0.1%

Largest 10 values

Value                 Count   Frequency (%)
426529                1       < 0.1%
417990                1       < 0.1%
388071                1       < 0.1%
379267                1       < 0.1%
332000                1       < 0.1%
331788                1       < 0.1%
330982                1       < 0.1%
326889                1       < 0.1%
317077                1       < 0.1%
310135                1       < 0.1%

PAY_AMT6
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 6939
Distinct (%): 23.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5221.498014
Minimum: 0
Maximum: 528666
Zeros: 7142
Zeros (%): 23.8%
Negative: 0
Negative (%): 0.0%
Memory size: 234.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 131
Median: 1500
Q3: 4000
95-th percentile: 17384.4
Maximum: 528666
Range: 528666
Interquartile range (IQR): 3869

Descriptive statistics

Standard deviation: 17786.97686
Coefficient of variation (CV): 3.406489252
Kurtosis: 166.9817897
Mean: 5221.498014
Median Absolute Deviation (MAD): 1500
Skewness: 10.63509397
Sum: 156462188
Variance: 316376546
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
0                     7142    23.8%
1000                  1299    4.3%
2000                  1295    4.3%
3000                  914     3.1%
5000                  808     2.7%
1500                  439     1.5%
4000                  411     1.4%
10000                 356     1.2%
500                   247     0.8%
6000                  220     0.7%
Other values (6929)   16834   56.2%

Smallest 10 values

Value                 Count   Frequency (%)
0                     7142    23.8%
1                     20      0.1%
2                     9       < 0.1%
3                     14      < 0.1%
4                     12      < 0.1%
5                     7       < 0.1%
6                     6       < 0.1%
7                     5       < 0.1%
8                     6       < 0.1%
9                     7       < 0.1%

Largest 10 values

Value                 Count   Frequency (%)
528666                1       < 0.1%
527143                1       < 0.1%
443001                1       < 0.1%
422000                1       < 0.1%
403500                1       < 0.1%
377000                1       < 0.1%
372495                1       < 0.1%
351282                1       < 0.1%
345293                1       < 0.1%
308000                1       < 0.1%

default.payment.next.month
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 234.2 KiB

Value                 Count
0                     23335
1                     6630

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value                 Count   Frequency (%)
0                     23335   77.9%
1                     6630    22.1%

Length

Histogram of lengths of the category

Pie chart


Interactions

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
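
The calculation described above can be sketched in a few lines of plain Python. This is an illustration only, not the code used to produce this report; the helper names (`average_ranks`, `covariance`, `spearman_rho`) and the sample data are hypothetical, and tied values are assumed to receive the mean of their rank positions.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def covariance(a, b):
    """Population covariance of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / len(a)


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    return covariance(rx, ry) / (sqrt(covariance(rx, rx)) * sqrt(covariance(ry, ry)))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # increases with x, but not linearly
rho = spearman_rho(x, y)
```

Because only the ranks enter the calculation, `rho` is (up to floating-point rounding) 1 for this cubic relationship even though the relationship is not linear.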

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the slope does not affect r; only its sign does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
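
A minimal sketch of this calculation in plain Python follows. It is not taken from the report's implementation; the function name `pearson_r` and the sample data are hypothetical, and population (divide by n) moments are assumed, which does not change the ratio.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov_xy / (std_x * std_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]  # roughly linear in x

r = pearson_r(x, y)

# Shifting and rescaling each variable separately (with positive scale
# factors) leaves r unchanged, up to floating-point rounding.
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

# Flipping the sign of one variable flips the sign of r.
r_flipped = pearson_r(x, [-b for b in y])
```

Comparing `r` with `r_rescaled` illustrates the invariance under location and scale changes mentioned above, while `r_flipped` shows that the direction of the relationship still matters.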

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
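
The pair-counting definition above corresponds to the variant usually called tau-a, and can be sketched as follows. The function name `kendall_tau_a` and the sample data are hypothetical. Tied pairs are counted as neither concordant nor discordant here; statistical libraries often report a tie-adjusted variant (tau-b) instead, so values may differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


tau_same = kendall_tau_a([1, 2, 3, 4], [10, 20, 30, 40])      # fully concordant
tau_reversed = kendall_tau_a([1, 2, 3, 4], [40, 30, 20, 10])  # fully discordant
tau_mixed = kendall_tau_a([1, 2, 3], [1, 3, 2])               # 2 concordant, 1 discordant
```

This brute-force version examines every pair, so it runs in quadratic time; it is meant to mirror the definition rather than to scale to a dataset of this size.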

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
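
As an illustration, the bias-corrected coefficient can be computed from a contingency table of counts as sketched below. This follows the commonly cited form of Bergsma's correction and is not taken from the report's own code; the function name `cramers_v_corrected` and the example tables are hypothetical, and every row and column total is assumed to be non-zero.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic (no continuity correction).
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected value of phi2 under independence, floored at zero.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    # Corrected numbers of rows and columns.
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


v_independent = cramers_v_corrected([[10, 20], [20, 40]])  # rows are proportional
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])        # one category determines the other
```

For the proportional table the corrected statistic is floored at 0, and for the diagonal table the corrections in the numerator and denominator cancel, giving 1 up to floating-point rounding.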

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

(index), df_index, LIMIT_BAL, SEX, EDUCATION, MARRIAGE, AGE, PAY_0, PAY_2, PAY_3, PAY_4, PAY_5, PAY_6, BILL_AMT1, BILL_AMT2, BILL_AMT3, BILL_AMT4, BILL_AMT5, BILL_AMT6, PAY_AMT1, PAY_AMT2, PAY_AMT3, PAY_AMT4, PAY_AMT5, PAY_AMT6, default.payment.next.month
0, 0, 20000.0, 2, 2, 1, 24, 2, 2, -1, -1, -2, -2, 3913.0, 3102.0, 689.0, 0.0, 0.0, 0.0, 0.0, 689.0, 0.0, 0.0, 0.0, 0.0, 1
1, 1, 120000.0, 2, 2, 2, 26, -1, 2, 0, 0, 0, 2, 2682.0, 1725.0, 2682.0, 3272.0, 3455.0, 3261.0, 0.0, 1000.0, 1000.0, 1000.0, 0.0, 2000.0, 1
2, 2, 90000.0, 2, 2, 2, 34, 0, 0, 0, 0, 0, 0, 29239.0, 14027.0, 13559.0, 14331.0, 14948.0, 15549.0, 1518.0, 1500.0, 1000.0, 1000.0, 1000.0, 5000.0, 0
3, 3, 50000.0, 2, 2, 1, 37, 0, 0, 0, 0, 0, 0, 46990.0, 48233.0, 49291.0, 28314.0, 28959.0, 29547.0, 2000.0, 2019.0, 1200.0, 1100.0, 1069.0, 1000.0, 0
4, 4, 50000.0, 1, 2, 1, 57, -1, 0, -1, 0, 0, 0, 8617.0, 5670.0, 35835.0, 20940.0, 19146.0, 19131.0, 2000.0, 36681.0, 10000.0, 9000.0, 689.0, 679.0, 0
5, 5, 50000.0, 1, 1, 2, 37, 0, 0, 0, 0, 0, 0, 64400.0, 57069.0, 57608.0, 19394.0, 19619.0, 20024.0, 2500.0, 1815.0, 657.0, 1000.0, 1000.0, 800.0, 0
6, 6, 500000.0, 1, 1, 2, 29, 0, 0, 0, 0, 0, 0, 367965.0, 412023.0, 445007.0, 542653.0, 483003.0, 473944.0, 55000.0, 40000.0, 38000.0, 20239.0, 13750.0, 13770.0, 0
7, 7, 100000.0, 2, 2, 2, 23, 0, -1, -1, 0, 0, -1, 11876.0, 380.0, 601.0, 221.0, -159.0, 567.0, 380.0, 601.0, 0.0, 581.0, 1687.0, 1542.0, 0
8, 8, 140000.0, 2, 3, 1, 28, 0, 0, 2, 0, 0, 0, 11285.0, 14096.0, 12108.0, 12211.0, 11793.0, 3719.0, 3329.0, 0.0, 432.0, 1000.0, 1000.0, 1000.0, 0
9, 9, 20000.0, 1, 3, 2, 35, -2, -2, -2, -2, -1, -1, 0.0, 0.0, 0.0, 0.0, 13007.0, 13912.0, 0.0, 0.0, 0.0, 13007.0, 1122.0, 0.0, 0

Last rows

(index), df_index, LIMIT_BAL, SEX, EDUCATION, MARRIAGE, AGE, PAY_0, PAY_2, PAY_3, PAY_4, PAY_5, PAY_6, BILL_AMT1, BILL_AMT2, BILL_AMT3, BILL_AMT4, BILL_AMT5, BILL_AMT6, PAY_AMT1, PAY_AMT2, PAY_AMT3, PAY_AMT4, PAY_AMT5, PAY_AMT6, default.payment.next.month
29955, 29990, 140000.0, 1, 2, 1, 41, 0, 0, 0, 0, 0, 0, 138325.0, 137142.0, 139110.0, 138262.0, 49675.0, 46121.0, 6000.0, 7000.0, 4228.0, 1505.0, 2000.0, 2000.0, 0
29956, 29991, 210000.0, 1, 2, 1, 34, 3, 2, 2, 2, 2, 2, 2500.0, 2500.0, 2500.0, 2500.0, 2500.0, 2500.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1
29957, 29992, 10000.0, 1, 3, 1, 43, 0, 0, 0, -2, -2, -2, 8802.0, 10400.0, 0.0, 0.0, 0.0, 0.0, 2000.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0
29958, 29993, 100000.0, 1, 1, 2, 38, 0, -1, -1, 0, 0, 0, 3042.0, 1427.0, 102996.0, 70626.0, 69473.0, 55004.0, 2000.0, 111784.0, 4000.0, 3000.0, 2000.0, 2000.0, 0
29959, 29994, 80000.0, 1, 2, 2, 34, 2, 2, 2, 2, 2, 2, 72557.0, 77708.0, 79384.0, 77519.0, 82607.0, 81158.0, 7000.0, 3500.0, 0.0, 7000.0, 0.0, 4000.0, 1
29960, 29995, 220000.0, 1, 3, 1, 39, 0, 0, 0, 0, 0, 0, 188948.0, 192815.0, 208365.0, 88004.0, 31237.0, 15980.0, 8500.0, 20000.0, 5003.0, 3047.0, 5000.0, 1000.0, 0
29961, 29996, 150000.0, 1, 3, 2, 43, -1, -1, -1, -1, 0, 0, 1683.0, 1828.0, 3502.0, 8979.0, 5190.0, 0.0, 1837.0, 3526.0, 8998.0, 129.0, 0.0, 0.0, 0
29962, 29997, 30000.0, 1, 2, 2, 37, 4, 3, 2, -1, 0, 0, 3565.0, 3356.0, 2758.0, 20878.0, 20582.0, 19357.0, 0.0, 0.0, 22000.0, 4200.0, 2000.0, 3100.0, 1
29963, 29998, 80000.0, 1, 3, 1, 41, 1, -1, 0, 0, 0, -1, -1645.0, 78379.0, 76304.0, 52774.0, 11855.0, 48944.0, 85900.0, 3409.0, 1178.0, 1926.0, 52964.0, 1804.0, 1
29964, 29999, 50000.0, 1, 2, 1, 46, 0, 0, 0, 0, 0, 0, 47929.0, 48905.0, 49764.0, 36535.0, 32428.0, 15313.0, 2078.0, 1800.0, 1430.0, 1000.0, 1000.0, 1000.0, 1